Order statistics for histogram data and a box plot visualization tool

Authors

  • Rosanna Verde
  • Antonio Balzanella
  • Antonio Irpino
Abstract

This paper introduces new descriptive statistics for histogram data within the framework of symbolic data analysis. The main contribution is a definition of the principal order statistics (median and quartiles) of a histogram variable, using the quantile functions associated with the empirical distribution functions of the observed histograms. The order relationship between quantile functions is defined through an appropriate probabilistic metric: the Wasserstein distance. Starting from the definitions of the median and quartile functions, we extend the classic box-plot representation to a set of quantile functions. Finally, we propose new measures of variability and skewness for a histogram variable associated with this representation. An application to real data corroborates the proposed measures and the new box-plot visualization tool.
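The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not state the exact definitions, so the sketch assumes uniform density inside each histogram bin, strictly positive bin weights, the L2 form of the Wasserstein distance evaluated on a midpoint grid, and a pointwise (level-by-level) median and quartiles of the quantile functions. The function names `histogram_quantile`, `wasserstein_l2`, and `pointwise_box` are hypothetical.

```python
import numpy as np


def histogram_quantile(edges, weights, t):
    """Quantile function of one histogram, evaluated at levels t in (0, 1).

    Assumption: mass is spread uniformly inside each bin, so the CDF is
    piecewise linear and its inverse can be obtained by linear interpolation.
    Bin weights are assumed strictly positive.
    """
    edges = np.asarray(edges, dtype=float)
    weights = np.asarray(weights, dtype=float)
    cdf = np.concatenate(([0.0], np.cumsum(weights) / weights.sum()))
    return np.interp(t, cdf, edges)


def wasserstein_l2(q1, q2):
    """Approximate L2 Wasserstein distance between two quantile functions
    sampled on the same equally spaced midpoint grid of levels."""
    return float(np.sqrt(np.mean((q1 - q2) ** 2)))


def pointwise_box(quantile_rows):
    """Level-by-level first quartile, median, and third quartile of a set of
    quantile functions (one row per histogram).

    Assumption: this pointwise rule is a stand-in for illustration, not the
    order relation defined in the paper.
    """
    lower, median, upper = np.quantile(quantile_rows, [0.25, 0.5, 0.75], axis=0)
    return lower, median, upper


# Midpoint grid of probability levels in (0, 1).
t = (np.arange(200) + 0.5) / 200

# Three illustrative histograms given as (bin edges, bin weights); made-up data.
histograms = [
    ([0.0, 1.0, 3.0], [0.6, 0.4]),
    ([1.0, 2.0, 4.0], [0.5, 0.5]),
    ([2.0, 3.0, 6.0], [0.3, 0.7]),
]
rows = np.array([histogram_quantile(e, w, t) for e, w in histograms])

lower, median, upper = pointwise_box(rows)
print("distance between first two histograms:", wasserstein_l2(rows[0], rows[1]))
print("median quantile function at t = 0.5:", np.interp(0.5, t, median))
```

Each of the three returned curves is non-decreasing in the level, so it is itself a valid quantile function and could serve as the lower edge, center line, and upper edge of a box drawn over a set of histograms.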


Similar resources

Visual Summary Statistics UUCS-07-004

Traditionally, statistical summaries of categorical data often have been visualized using graphical plots of central moments (e.g., mean and standard deviation), or cumulants (e.g., median and quartiles) by box plots. In this work we reexamine the box plot and its relatives and develop a new hybrid summary plot that combines moment, cumulant, and density information. In view of the important ro...


Data Visualization of Outliers from a Health Research Perspective Using SAS/GRAPH and the Annotate Facility

SAS/GRAPH is a powerful tool for customizing the box plot to detect and identify outliers. This paper shows how to use the ANNOTATE facility and annotate data set to customize box plots and profile plots of outliers using data from a dietary-health study. This paper assumes: • A working knowledge of basic SAS/GRAPH procedures. • The ability to display or print graphics on your operating system...


Visualization and Exploration of Time-varying and Diffusion Tensor Medical Image Data Sets

In this work, we propose and compare several methods for the visualization and exploration of time-varying volumetric medical images based on the temporal characteristics of the data. The principal idea is to consider a time-varying data set as a 3D volume where each voxel contains a time-activity curve (TAC). We define and appraise three different TAC similarity measures. Based on these measur...


Practice of Epidemiology More Than Numbers: The Power of Graphs in Meta-Analysis

In meta-analysis, the assessment of graphs is widely used in an attempt to identify or rule out heterogeneity and publication bias. A variety of graphs are available for this purpose. To date, however, there has been no comparative evaluation of the performance of these graphs. With the objective of assessing the reproducibility and validity of graph ratings, the authors simulated 100 meta-anal...


Design and Implementation of a System for Interactive High-Dimensional Vector Field Visualization

Although the challenge of 2D flow visualization is deemed virtually solved as a result of the tremendous amount of effort invested in this problem, high-dimensional flow visualization (e.g., the visualization of flow on surfaces in 3D (2.5D), volumetric flow (3D), and flow with several attributes (nD)) still poses many challenges and unsolved problems. In this paper we describe the desi...




Publication date: 2015